Implementing One to Many Data Linkage Using One Class Clustering Tree

Author

  • M.REENA BANU
Abstract

Data linkage is typically performed among entities of the same type, and one-to-one data linkage links a record in one table to a single record in another table. There is a need for linkage techniques that can match entities of different types and that extend one-to-one linkage to one-to-many linkage. The proposed method is based on the one-class clustering tree (OCCT), which characterizes the entities that should be linked together. The tree is easy to understand and can be transformed into association rules: the inner nodes consist only of features describing the first set of entities, while the leaves represent features of their matching entities from the second data set. Four splitting criteria, including the coarse-grained Jaccard coefficient, the fine-grained Jaccard coefficient, and least possible intersection, and two different pruning methods can be used for inducing the OCCT.
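As a rough illustration of how a Jaccard-based splitting criterion might work, the sketch below scores a candidate split by the average pairwise Jaccard coefficient between the sets of matching second-table records that the split produces, and prefers the attribute with the lowest score. The function names (`jaccard`, `split_score`, `best_split`), the sample data, and the choice of averaging and minimizing are all assumptions made for illustration; the abstract does not specify how the coarse-grained or fine-grained variants are computed.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard coefficient |a ∩ b| / |a ∪ b| of two collections."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def split_score(pairs, attribute):
    """Average pairwise Jaccard similarity between the groups of matching
    second-table records induced by one first-table attribute.

    `pairs` is a list of (first_record, second_record) tuples, where
    first_record is a dict and second_record is a hashable tuple.
    (Hypothetical layout, chosen only for this sketch.)
    """
    groups = {}
    for first, second in pairs:
        groups.setdefault(first[attribute], set()).add(second)
    subsets = list(groups.values())
    if len(subsets) < 2:
        return 1.0  # the attribute does not separate anything
    scores = [jaccard(x, y) for x, y in combinations(subsets, 2)]
    return sum(scores) / len(scores)


def best_split(pairs, attributes):
    """Pick the attribute whose groups overlap least (assumed objective)."""
    return min(attributes, key=lambda attr: split_score(pairs, attr))


# Made-up example: first-table records describe customers,
# second-table records describe the items linked to them.
EXAMPLE_PAIRS = [
    ({"region": "north", "tier": "gold"}, ("books",)),
    ({"region": "north", "tier": "basic"}, ("music",)),
    ({"region": "south", "tier": "gold"}, ("books",)),
    ({"region": "south", "tier": "basic"}, ("music",)),
]

if __name__ == "__main__":
    print(best_split(EXAMPLE_PAIRS, ["region", "tier"]))
```

In this toy data, splitting on `tier` separates the linked items cleanly, while splitting on `region` leaves both groups with the same items, so `tier` is the preferred inner-node attribute under the assumed objective.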


Similar resources

Text Data Linkage of Different Entities Using Occt-One Class Clustering Tree

A new one-to-many and many-to-many data linkage method is based on a One-Class Clustering Tree (OCCT), which characterizes the entities that should be linked together. It is evaluated using datasets from data leakage prevention, recommender systems, and fraud detection. The tree is built such that it is easy to understand and transform into association rules. Data linkage is closely related to entity r...


Improved One-to-Many Record Linkage using One Class Clustering Tree

Record linkage is traditionally performed among entities of the same type. It can be done for entities that may or may not share a common identifier. In this paper we propose a new linkage method that also performs linkage between matching entities of different data types. The proposed technique is based on a one-class clustering tree that characterizes the entities which are to be linked...


Choosing the Best Hierarchical Clustering Technique Based on Principal Components Analysis for Suspended Sediment Load Estimation

1- INTRODUCTION The assessment of watershed sediment load is necessary for controlling soil erosion and reducing the potential for sediment production. Differing estimates of sediment amounts, along with the lack of long-term measurements, limit access to reliable data series of erosion rate and sediment yield. Therefore, the observed data of suspended sediment load could be used to ...


A Decision Tree Based Record Linkage for Recommendation Systems

Record linkage merges all the records relating to the same entity from multiple datasets, at the entity level. It is the initial data preparation phase for most database projects. Traditionally, one-to-one data linkage is performed among entities of the same type with a common unique identifier. The proposed one-to-many and/or many-to-many record linkage method is able to link the entities ...


Monotone Linkage Clustering and Quasi-Convex Set Functions

Greedily seriating objects one by one is implicitly employed in many heuristic clustering procedures, which can be described in terms of a linkage function measuring entity-to-set dissimilarities. A well-known clustering technique, single linkage clustering, can be considered as an example of the seriation procedures (actually, based on the minimum spanning tree construction) leading to the glo...




Publication date: 2014